The University of Amsterdam at WebCLEF 2007: Using Centrality to Rank Web Snippets
Abstract
We describe our participation in the WebCLEF 2007 task, targeted at snippet retrieval from web data. Our system ranks snippets based on a simple similarity-based centrality, inspired by web page ranking algorithms. We experimented with retrieval units (sentences and paragraphs) and with the similarity functions used for centrality computations (word overlap and cosine similarity). We found that using paragraphs with the cosine similarity function gives the best performance, with precision around 20% and recall around 25% according to human assessments of the first 7,000 bytes of responses for individual topics.
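The abstract does not give the exact centrality formula, so the following is only a minimal sketch of the general idea, under stated assumptions: each snippet is scored by the sum of its similarities to all other snippets (a degree-style centrality), and snippets are ranked by that score. The function names, the tokenizer, and the Jaccard form of word overlap are illustrative choices, not the authors' implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_overlap(a, b):
    """Word overlap as Jaccard similarity of the two vocabularies (assumed form)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Cosine similarity between raw term-frequency vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_centrality(snippets, similarity=cosine):
    """Score each snippet by its summed similarity to all others, then sort.

    Hypothetical helper: returns (snippet, score) pairs, most central first.
    """
    tokens = [tokenize(s) for s in snippets]
    scores = []
    for i, ti in enumerate(tokens):
        score = sum(similarity(ti, tj) for j, tj in enumerate(tokens) if j != i)
        scores.append(score)
    order = sorted(range(len(snippets)), key=lambda i: scores[i], reverse=True)
    return [(snippets[i], scores[i]) for i in order]
```

Calling `rank_by_centrality(paragraphs, similarity=word_overlap)` switches the similarity function, and passing sentences instead of paragraphs switches the retrieval unit, which mirrors the two dimensions varied in the experiments.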
Related resources
Using Centrality to Rank Web Snippets
We describe our participation in the WebCLEF 2007 task, targeted at snippet retrieval from web data. Our system ranks snippets based on a simple similarity-based centrality, inspired by the web page ranking algorithms. We experimented with retrieval units (sentences and paragraphs) and with the similarity functions used for centrality computations (word overlap and cosine similarity). We found ...
REINA at WebCLEF 2007. Selecting Useful Snippets
The task for this year consists of retrieving snippets, or pieces of text, from web documents on several topics. The extraction of such snippets can be approached in several ways, as can the selection of the most useful among them. We describe the segmentation process adopted and the selection of snippets carried out.
Overview of WebCLEF 2008 (Draft)
We describe the WebCLEF 2008 task. Similarly to the 2007 edition of WebCLEF, the 2008 edition implements a multilingual “information synthesis” task, where, for a given topic, participating systems have to extract important snippets from web pages. We detail the task and the assessment procedure. At the time of writing evaluation results are not available yet.
Overview of WebCLEF 2008
We describe the WebCLEF 2008 task. Similarly to the 2007 edition of WebCLEF, the 2008 edition implements a multilingual “information synthesis” task, where, for a given topic, participating systems have to extract important snippets from web pages. We detail the task, the assessment procedure, the evaluation measures and results.
Segmentation of Web Documents and Retrieval of Useful Passages
This year’s WebCLEF task was to retrieve snippets and pieces from documents on various topics. The extraction and the choice of the most widely used snippets can be carried out using various methods. This article illustrates the segmentation process and the choice of snippets produced in this process. It also describes the tests carried out and their results.